Aspects on speech perception as the model for ASR
Abstract
Over the last decades it has become increasingly popular to draw on knowledge of human language skills in speech technology applications. This has proven successful in some areas, such as signal processing in automatic speech recognition (ASR), but overall performance is still far behind that of humans. Perhaps the problems within speech technology can be solved with improved engineering, but closer collaboration between technology and cognitive science might yield insight into more fundamental principles of language processing that could prove useful in speech technology.
Similar resources
An Introduction to Decision Rules For Automatic Speech Recognition
Modern automatic speech recognition (ASR) technology is based on a communication-theoretical view of the generation, acquisition, transmission, and perception of speech. The goal of speech recognition is then defined as recovering the intended sequence of linguistic units from the observed acoustic signal. Under a statistical decision-theoretic formulation, ASR is formulated as a statistical decisio...
Establishing a New and Efficient Framework for Persian Speech Recognition
Although research on Persian speech recognition in Iran has a thirty-year history and has made considerable progress, the lack of a well-defined experimental framework means that the outcomes of many of these studies are not comparable with one another and cannot be accurately assessed. The experimental framework includes an ASR toolkit and a speech database ...
DNN-Based Automatic Speech Recognition as a Model for Human Phoneme Perception
In this paper, we test the applicability of state-of-the-art automatic speech recognition (ASR) to predict phoneme confusions in human listeners. Phoneme-specific response rates are obtained from ASR based on deep neural networks (DNNs) and from listening tests with six normal-hearing subjects. The measure for model quality is the correlation of phoneme recognition accuracies obtained in ASR an...
ASR Systems as Models of Phonetic Category Perception in Adults
Adult speech perception is tuned to efficiently process native phonetic categories, causing difficulties with certain non-native categories. For example, Japanese has no equivalent of the distinction between American English /r/ and /l/ and native speakers of Japanese have a hard time discriminating between these two sounds. Here, we ask whether standard Automatic Speech Recognition (ASR) syste...
A Binaural Microscopic Model Based on a Modulation Filter Bank for Predicting Speech Intelligibility in Normal-Hearing Listeners
In this study, a binaural microscopic model for predicting speech intelligibility, based on a modulation filter bank, is introduced. So far, spectral criteria such as the STI and SII, or other analytical methods, have been used in binaural models to determine binaural intelligibility. In the proposed model, unlike all other models of binaural intelligibility prediction, an automatic ...